1. Introduction


As pointed out by Torre and Poggio (1984), many problems of early vision are ill-posed;
unique stable solutions can be recovered by several regularization techniques, in particular
by standard regularization, due mainly to Tikhonov (1943, 1963). Provided that solutions
belong to suitable compact sets, these techniques can be successfully applied to a broad
class of problems (for a brief review see Poggio, Torre and Koch, 1985), such as surface
interpolation (Grimson 1981, 1982; Terzopoulos 1984), computation of visual motion (Horn
and Schunck 1981; Hildreth 1984), recovering shape from shading (Ikeuchi and Horn 1981),
lightness (Horn 1974) and edge detection (Torre and Poggio 1986).

According to standard regularization theory, stable solutions can be recovered quite 
simply if they belong to a compact set. Otherwise standard regularization techniques have 
to be applied. These methods search for a solution as close as possible to the data and 
belonging to a compact set defined by a suitable stabilizing functional. In both cases, as we 
will see in detail, the concept of compact set plays a key role. Very often, however, some 
additional constraints on the shape of the possible solutions are available: for example the 
solutions may belong to the set of positive functions, as in the case of lightness, or may be 
bounded by the values of some known functions or may be piece-wise continuous or
piece-wise constant as in some instances of surface interpolation. More generally it can be said
that these constraints define a certain subset in a suitable functional space. Rutman and
Cabral (1974) have shown that combining regularization techniques and shape constraints
improves the correctness of the numerical solution in linear integral problems.

In this note, after a brief review of ill-posedness in functional spaces and in $\mathbb{R}^n$, we show
which of these constraints can be embedded in the classic regularization theory, and how. 
Two cases are considered in detail. In the first one, shape constraints, forcing the solution 
to belong to a compact set, allow a straightforward regularization of the problem. In the 
second, more general case, shape constraints define closed sets that can be incorporated into
the framework of classical regularization theory, where an appropriate stabilizing functional
constrains the solution to a compact set, providing a simple way in which some a priori 
knowledge can be taken into account. Some functional subsets corresponding to interesting 
shape constraints are considered. 

We also answer questions arising in the numerical solutions of regularized problems. 
Since regularization with shape constraints is a problem of constrained minimization, we 
discuss in some detail the relationship with mathematical programming. 

Our main conclusion is that shape constraints can be applied in regularization theory 
provided they define compact or at least closed subsets. Constraints involving discontinuities
do not fit into this scheme while, for example, monotonicity, convexity and positivity
constraints do.


2. Overview: ill-posed problems in infinite and finite dimensional spaces

In this section we review briefly the main problems involved in the ill-posedness of equations 
in infinite and finite dimensional spaces. We introduce the concepts of normal solution and 
quasi-solution and show the connection with uniqueness and existence of the solution to a 
given problem. Relationships between ill-conditioned and ill-posed problems in the discrete 
case are also examined. 

2.1. Ill-posed problems in Hilbert spaces 

Let us consider the problem of solving the equation 


$$Ax = y \tag{2.1.1}$$

for x, where x and y belong to the Hilbert spaces X and Y. The operator A, defined on
$D(A) \subset X$, maps D(A) onto $R(A) \subset Y$. In many applications it is required that the
solution to (2.1.1) i) exists, ii) is unique and iii) depends continuously on y. A problem
whose solution satisfies i), ii) and iii) is said to be well-posed (Hadamard, 1923); otherwise
it is said to be ill-posed. Notice that iii) may depend on the choice of the metric in X and
in Y.

If A is linear, continuous, injective and R(A) = Y, the problem of solving (2.1.1) for 
x is trivially well-posed: indeed, since A is a bijection between D(A) and Y, existence and 
uniqueness of the solution are guaranteed. Moreover x depends continuously on y because, 
when R(A) = Y, $A^{-1}$ is continuous (Riesz and Nagy, 1952).

If A is linear but not injective, the solution to (2.1.1) is no longer unique.
Uniqueness of the solution can be easily recovered, for instance, by introducing the concept
of normal solution. The normal solution $x_n$ to (2.1.1) is the solution orthogonal to the null
space of A, N(A). It is easy to see that $x_n$ is unique and that it can be characterized as
the minimum norm solution. If A is injective, the normal solution and the usual solution
coincide.

If we relax the condition R(A) = Y other problems arise. The solution to (2.1.1) may
no longer exist since y may not belong to R(A). For example the data y may be affected by
an error $\delta y$ belonging to the orthogonal complement of the range of A, $R(A)^\perp$. It is useful
then to introduce the concept of quasi-solution (see, for example, Tikhonov and Arsenin,
1977). Let P be an operator that projects Y onto R(A); then x, the solution to the equation

$$Ax = Py \tag{2.1.2}$$

is called a quasi-solution of (2.1.1). It is obvious that x exists if $y \in R(A) \oplus R(A)^\perp$. Notice
that if $y \in R(A)$, the quasi-solution and the solution to (2.1.1) coincide.

Therefore if A is linear, continuous and R(A) is closed, the problem of finding a normal
quasi-solution to the equation (2.1.1) is well-posed, since the normal quasi-solution always
exists, is unique and depends continuously on y. (This last condition follows directly from
the continuity of the quasi-inverse $A^+$ of A, $A^+$ being defined as the operator that maps
$y \in R(A) \oplus R(A)^\perp$ into the corresponding normal quasi-solution of (2.1.2).)
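To make the notions of quasi-solution and normal solution concrete, the following minimal sketch works through a small finite-dimensional case. The matrix A = [[1, 1], [1, 1]], the data vector and all function names are hypothetical choices made only for illustration; for this A the range is spanned by (1, 1) and the null space by (1, -1), so the projection and the minimum-norm solution can be written by hand.

```python
import math

# Hypothetical 2x2 example (not from the text): A = [[1, 1], [1, 1]].
# R(A) is spanned by (1, 1); N(A) is spanned by (1, -1).

def apply_A(x):
    s = x[0] + x[1]
    return (s, s)

def project_onto_range(y):
    """Orthogonal projection P of y onto R(A) = span{(1, 1)}."""
    mean = (y[0] + y[1]) / 2.0
    return (mean, mean)

def normal_quasi_solution(y):
    """Minimum-norm x with A x = P y, i.e. x1 + x2 = (y1 + y2) / 2."""
    s = project_onto_range(y)[0]      # required value of x1 + x2
    return (s / 2.0, s / 2.0)         # component along N(A) set to zero

def norm(v):
    return math.sqrt(sum(c * c for c in v))

y = (1.0, 3.0)                        # not in R(A): A x = y has no solution
x_plus = normal_quasi_solution(y)

# Any other quasi-solution differs by a multiple of (1, -1) and has a larger norm.
others = [(x_plus[0] + t, x_plus[1] - t) for t in (-1.0, 0.5, 2.0)]
```

Every vector in `others` still satisfies A x = P y, which illustrates why the quasi-solution alone is not unique and why the orthogonality (minimum-norm) condition is needed to single out one of them.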

In many practical cases, however, R(A) is not closed (Kolmogorov and Fomine, 1980).
So even the quasi-solution may not exist and, if it exists, can be unstable. Consider, for
example, the Fredholm integral equation of the first kind

$$\int_a^b K(t,s)\,x(s)\,ds = y(t), \qquad c < t < d. \tag{2.1.3}$$

The function

$$\tilde{x}(s) = x(s) + N \sin(\omega s)$$

is a solution to (2.1.3) with

$$\tilde{y}(t) = y(t) + N \int_a^b K(t,s) \sin(\omega s)\,ds.$$

In the usual $L^2$ metric $\|\tilde{y} - y\| \to 0$ as $\omega \to \infty$ (by the Riemann-Lebesgue theorem) while
$\|\tilde{x} - x\| \sim N$. So with a suitable choice of N and $\omega$ the error on the data can be made
arbitrarily small, while the distance between the solutions can be arbitrarily large.
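The instability can be observed numerically. The sketch below is only an illustration under assumed choices that do not come from the text: the interval [0, 1] for both s and t, the smooth kernel K(t, s) = exp(-(t - s)^2), N = 1, and midpoint-rule quadrature. It returns the L2 norms of the perturbation N sin(ωs) of the solution and of the induced perturbation of the data.

```python
import math

def perturbation_norms(omega, N=1.0, n_s=2000, n_t=50):
    """L2 norms of (x~ - x) and (y~ - y) on [0, 1] by the midpoint rule.

    Assumed kernel (illustrative only): K(t, s) = exp(-(t - s)^2).
    """
    hs, ht = 1.0 / n_s, 1.0 / n_t
    s_nodes = [(k + 0.5) * hs for k in range(n_s)]
    # Squared norm of the solution perturbation N * sin(omega * s).
    dx_sq = sum((N * math.sin(omega * s)) ** 2 for s in s_nodes) * hs
    dy_sq = 0.0
    for j in range(n_t):
        t = (j + 0.5) * ht
        # Data perturbation at t: N * integral of K(t, s) * sin(omega * s) ds.
        dy = sum(math.exp(-(t - s) ** 2) * N * math.sin(omega * s)
                 for s in s_nodes) * hs
        dy_sq += dy * dy * ht
    return math.sqrt(dx_sq), math.sqrt(dy_sq)

# Map each frequency to the pair (norm of x~ - x, norm of y~ - y).
results = {omega: perturbation_norms(omega) for omega in (1.0, 10.0, 100.0)}
```

With these choices the data perturbation shrinks as ω grows while the solution perturbation stays of order N, which is the behaviour described above.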
2.2. Ill-posed problems in $\mathbb{R}^n$

Let us consider the system of equations 


$$Ax = y \tag{2.2.1}$$

where A is an $n \times n$ matrix and x and y are vectors belonging to $\mathbb{R}^n$. The problem of recovering
x given A and y is that of finding the inverse matrix $A^{-1}$ of A. If the determinant of A
is equal to zero, the inverse does not exist, the problem has no unique solution and the
system is called singular. If A is
diagonalizable and some eigenvalues are much smaller than the others, the system is said
to be ill-conditioned (Strang 1976), since small errors in the data y lead to unacceptable
indeterminacy in the components of the solution x. In such cases the ratio between the
largest and the smallest eigenvalue of A is taken as the ill-conditioning number, which is a
measure of how ill-conditioned the system is. Notice that whether an ill-conditioning
number leads to negligible errors or not depends not only on the system but also on the
accuracy required.

Even in the case of huge ill-conditioning number, however, the problem of solving (2.2.1) 
is not ill-posed in a classical sense, since for arbitrarily small errors in the data, the solution 
is arbitrarily close to the exact solution. In practice, however, approximations involved in
numerical computations lead to meaningless solutions, because the error in the data is not 
arbitrarily small. 

Let us consider now, more closely, the problems that could arise in numerical
computations: let $\lambda_i$, $i = 1, \dots, n$, be the eigenvalues of A. It is easy to see that

$$x_i = \frac{1}{\lambda_i}\, y_i, \qquad i = 1, \dots, n$$

will be the components of x, the solution of (2.2.1), after a suitable transformation of
coordinates. If even small errors affect the entries of A, when some $\lambda_i$ are sufficiently close to
zero, the corresponding components $x_i$ of the solution can become arbitrarily large, leading
to an unbounded solution. As a matter of fact, the errors arising from numerical computer
approximations could be sufficient; therefore even numerical problems can be ill-posed.
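As a small numerical illustration of this amplification, the sketch below solves a system that is assumed to be already diagonal, so that each component is simply the data component divided by the corresponding eigenvalue. The eigenvalues, data and noise level are hypothetical values chosen for the example.

```python
# Hypothetical diagonal system (already in the eigenvector basis): x_i = y_i / lambda_i.
eigenvalues = [1.0, 1e-2, 1e-6]
y_exact = [1.0, 1e-2, 1e-6]          # chosen so that the exact solution is (1, 1, 1)
noise = [1e-4, 1e-4, 1e-4]           # the same small error on every data component

def solve(lams, y):
    return [yi / li for li, yi in zip(lams, y)]

x_exact = solve(eigenvalues, y_exact)
x_noisy = solve(eigenvalues, [yi + ei for yi, ei in zip(y_exact, noise)])
errors = [abs(a - b) for a, b in zip(x_noisy, x_exact)]

# Ratio of largest to smallest eigenvalue, the "ill-conditioning number" of the text.
ill_conditioning_number = max(eigenvalues) / min(eigenvalues)
```

The same data error is harmless on the first component and dominates the third, where it is divided by the smallest eigenvalue.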


3. Shape constraints in regularization 

Ill-posed problems can be successfully turned into well-posed problems by means of very 
general regularizing techniques. As is well known (Tikhonov 1943, 1963), these techniques
rely on the assumption of some smoothness property of the possible solution. Sometimes, 
however, additional and useful constraints are available; for example the solution function 
may be necessarily non-negative or a monotonic function and so on. In this chapter, after 
discussing the role of compactness in regularization, we show which of these constraints can 
be embedded in the classical regularization theory and how. 

3.1. Role of compactness in regularization 

The role played by compactness (see Appendix A for its various definitions and properties) in 
the solution of ill-posed problems was clarified by Tikhonov with the following fundamental 
topological Lemma (Tikhonov and Arsenin, 1977): 

Lemma 3.1.1 Suppose that the operator A maps a compact set $F \subset X$ onto the set $U \subset Y$,
X and Y metric spaces. If $A : F \to U$ is continuous and one-to-one, then the inverse
mapping $A^{-1} : U \to F$ is also continuous.

By means of this Lemma, if the solution to equation (2.1.1) is known to belong to a 
compact subset of X, say F, and if the perturbed data is known to belong to U,
$U = \{y \in Y : y = Ax,\ x \in F\}$, then the problem of finding a solution to (2.1.1) is trivially well-posed with
respect to F and U. In such a case the problem is said to be well-posed in the sense of
Tikhonov.

Remark: The compactness requirement is a strong constraint on the set of possible solutions
to a given problem: it is possible to produce examples in which well-posedness is guaranteed 
without any compactness requirement (Groetsch, 1984). 

If some a priori constraints on the shape of the solution are known and if these constraints
lead to the definition of a suitable compact set, the application of Lemma 3.1.1 is
straightforward. This is the theme of the next sections. 

3.2. The selection method 

A useful method of finding an approximate solution to equation (2.1.1) is the selection
method (Tikhonov and Arsenin, 1977). It consists in calculating the operator A for points
belonging to a given sample set, looking for the minimum of $\|Ax - y\|$ in a suitable norm.
Such a method is powerful from a computational point of view since the sample set can
be chosen so as to depend only on a finite number n of parameters varying within finite limits.
Obviously the computed solution $x_n$ and the exact solution $x_t$ (if $x_t$ exists) coincide if and
only if $x_t$ belongs to the sample set.

Suppose that, on increasing the number n of parameters (and therefore the dimension of
the subspace containing the sample set), $\|Ax_n - y\| \to 0$ as $n \to \infty$. It is easy to see that
if R(A) is not closed the norm of the approximate solution may diverge, $\|x_n\| \to \infty$, so that
$x_n$ does not converge to $x_t$. In order to guarantee the convergence
of $x_n$ to $x_t$, compactness of the sample set is needed, so that Lemma 3.1.1 applies. If the
sample set is not compact but is closed and bounded, Lemma 3.1.1 is still valid,
though in a weaker sense. The solution $x_n$, in fact, is only weakly convergent¹ to the true
solution $x_t$; it is also convergent in the usual sense if $x_t$ lies on the boundary of the sample
set (Bertero, 1982).
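The selection method can be sketched in a few lines for a toy discrete problem. Everything specific here is an assumption made for illustration: the 3 x 3 matrix K standing in for the operator A, the Euclidean norm, and a sample set made of non-increasing vectors whose components take five values in [0, 1]. The method itself is just what is described above: evaluate the operator on every sample point and keep the one with the smallest residual.

```python
import itertools
import math

# Hypothetical discretised operator A (3 data points, 3 unknowns); illustrative only.
K = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.5],
     [0.2, 0.5, 1.0]]

def apply_operator(x):
    return [sum(kij * xj for kij, xj in zip(row, x)) for row in K]

def residual(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(apply_operator(x), y)))

def sample_set(levels):
    """All non-increasing vectors with components taken from `levels`."""
    for x in itertools.product(levels, repeat=3):
        if x[0] >= x[1] >= x[2]:
            yield x

def selection_method(y, levels):
    """Evaluate A on every sample point and keep the one with the smallest residual."""
    return min(sample_set(levels), key=lambda x: residual(x, y))

levels = [i / 4 for i in range(5)]            # 0, 0.25, ..., 1: finite parameter range
x_true = (1.0, 0.5, 0.25)                     # belongs to the sample set
y = apply_operator(x_true)
x_n = selection_method(y, levels)
```

Because `x_true` was chosen inside the sample set, the search recovers it exactly; for a solution outside the sample set the method would return only the closest sample point in the residual sense.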


¹ The solution $x_n$ is said to be weakly convergent to $x_t$ if $(x_n, y) \to (x_t, y)$ as $n \to \infty$ for all $y \in X$,
where $(\cdot,\cdot)$ is a suitable inner product.

3.3. Regularization theory and shape constraints

When no compact set containing the possible solution of (2.1.1) can be found, a new
approach is needed. A general and useful approach was also outlined by Tikhonov (1943, 1963)
and is called standard regularization theory. Let us briefly summarize the main points of 
this theory. 

The fundamental concept of the theory is that of a regularizing operator. Suppose that
the equation (2.1.1) allows $x = x_t$ as a solution when $y = y_t$; then an operator $R(y, \alpha)$ is
called a regularizing operator for the equation (2.1.1) in a neighborhood of $x = x_t$ if:

i) $\exists\, \delta_1 > 0$ such that $R(y, \alpha)$ is defined $\forall \alpha > 0$ and $\forall y \in Y$ such that $\|y - y_t\| \le \delta_1$;

ii) there exists a function $\alpha = \alpha(\delta)$ such that $\forall \epsilon > 0$ $\exists\, \delta \le \delta_1$ such that $\forall y$

$$\|y - y_t\| \le \delta \;\Rightarrow\; \|x_t - x_\alpha\| \le \epsilon$$

where $x_\alpha = R(y, \alpha(\delta))$.

So the problem of finding a regularized solution to an ill-posed problem is shifted to 
that of finding methods to construct a regularizing operator. Let us see in some detail one 
of these methods. 

Construction of regularizing operators by minimization of a smoothing functional 

It is possible to construct a regularizing operator for (2.1.1) by minimizing the following
functional with respect to x:

$$\Psi_\alpha[x, y] = \|Ax - y\| + \alpha\,\Omega[x] \tag{3.3.1}$$

where $\Omega$ is a stabilizing functional. A functional $\Omega$ defined on a set $O \subset D(A)$ everywhere dense
in D(A) is a stabilizing functional for the equation (2.1.1) if:

i) $x_t$ belongs to the domain of definition of $\Omega$;

ii) $\forall d > 0$, $\{x \in O : \Omega[x] \le d\}$ is a compact subset of $O$.

Indeed the following theorem holds: 

Theorem 3.3.1 Let A denote a continuous operator. For every $y \in Y$ and every $\alpha > 0$, there
exists an $x_\alpha \in O$ for which the functional $\Psi_\alpha[x, y]$ attains its minimum.

As a matter of fact the choice of $\Omega$ can determine the uniqueness of the solution: for
example, if D(A) is a Hilbert space, A is linear and $\Omega$ is quadratic, a sufficient condition
for the uniqueness of the regularized solution can be proved (Tikhonov and Arsenin, 1977).
In principle, the regularization problem is completely solved. Sometimes, however, some
additional constraints on the shape of the solution are available. Can we exploit them?
Indeed, the following Lemma holds:

Lemma 3.3.2 Let X be a compact topological space. Then every closed subset of X is 
compact. 

Theorem 3.3.1 is based on the compactness of the subsets where $\Omega$ is bounded and
therefore is still true even if the set of possible solutions is a closed subset of D(A). Therefore,
if the additional constraints lead to the definition of some closed subset of D(A), they can
be easily exploited in the framework of regularization theory.

Remark: These sets do not need to be compact. The regularizing scheme itself provides
compactness of the set in which the solution is actually searched; if the constraints define a
compact set, Lemma 3.1.1 is sufficient to guarantee well-posedness of the problem.
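The stabilizing effect of minimizing a smoothing functional can be illustrated on a finite-dimensional toy problem. The sketch below makes several assumptions that are not in the text: the operator is diagonal, the residual norm is squared (the functional (3.3.1) uses the norm itself), and the stabilizing functional is the squared Euclidean norm of x. Under these assumptions the minimizer is available in closed form, component by component.

```python
def tikhonov_diagonal(lams, y, alpha):
    """Minimiser of sum_i (lam_i * x_i - y_i)^2 + alpha * sum_i x_i^2.

    Setting the derivative with respect to x_i to zero gives
    x_i = lam_i * y_i / (lam_i^2 + alpha).
    """
    return [li * yi / (li * li + alpha) for li, yi in zip(lams, y)]

eigenvalues = [1.0, 1e-2, 1e-6]                    # hypothetical, badly conditioned
y_noisy = [1.0 + 1e-4, 1e-2 + 1e-4, 1e-6 + 1e-4]   # exact data (1, 1e-2, 1e-6) plus noise

x_naive = [yi / li for li, yi in zip(eigenvalues, y_noisy)]
x_reg = tikhonov_diagonal(eigenvalues, y_noisy, alpha=1e-4)
```

The naive inverse blows up on the component with the smallest eigenvalue, while the regularized solution stays bounded. The price is a bias on the poorly determined components, which is why additional a priori knowledge on the shape of the solution is worth exploiting.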

3.4. Compact subsets of functional spaces 

From the preceding sections, it turns out that given an ill-posed problem and some a priori 
constraints, it is important to determine whether such constraints define a compact subset 
or at least a closed subset of a suitable functional space. Let us examine some examples of 
subsets of $L^2$ and $C^0$.

The set of bounded non-decreasing (non-increasing) functions is a compact set in $L^2$.
The proof (see Taylor, 1965, for example) relies upon the fact that the number of discontinuity
points of a monotonic bounded function is at most countable.

The set of convex functions is compact. This result follows trivially from the compactness
of the set above, since each convex function is the integral of a suitable non-decreasing
function.

The set of bounded piece-wise constant functions is neither closed nor compact. It is
not compact since it is everywhere dense in $L^2$ (which is trivially not compact). It is not
closed since any continuous function is an accumulation point of this set.

It is not easy to find compact subsets of $C^0$. The set of bounded non-negative functions,
for example, is not compact. Consider in C[0,1]

$$T = \{x : |x(t)| \le 1,\ t \in [0,1]\}.$$

T is closed and bounded (obvious), but not compact. Indeed, let $S = \{x_n\}_{n \in \mathbb{N}}$ be a sequence
of functions with $x_n(t) = t^n$. No subsequence of S can converge in T, since in C[0,1]
convergence is uniform convergence, while $t^n \to 0$ if $0 \le t < 1$ and $t^n = 1$ if $t = 1$, so that
the pointwise limit is discontinuous. So
T is not compact.
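A quick numerical check makes the lack of a uniformly convergent subsequence tangible. With u = t^n, the sup-norm distance between t^n and t^(2n) equals the maximum of u - u^2 on [0, 1], which is 1/4 for every n, so the members of the sequence never get uniformly close to one another. The grid-based approximation below is only a sketch; the grid size is an arbitrary choice.

```python
def sup_distance(n, m, grid_size=10001):
    """Approximate sup-norm distance between t**n and t**m on [0, 1] using a uniform grid."""
    ts = [k / (grid_size - 1) for k in range(grid_size)]
    return max(abs(t ** n - t ** m) for t in ts)

# Distance between x_n and x_2n; analytically it is max_u (u - u^2) = 1/4 with u = t^n.
distances = [sup_distance(n, 2 * n) for n in (1, 5, 20, 50)]
```

Each computed distance stays close to 1/4 instead of shrinking, consistent with the argument that no subsequence is Cauchy in the uniform norm.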

Remark: This counterexample shows that in $C^0$ even the sets of monotonic and convex
functions are not compact.

In conclusion, the constraints of monotonicity and of convexity, defining compact
subsets, can be useful in regularization either via the selection method or via standard
techniques (since any compact set is closed, see Appendix A for details). The positivity constraint
can be used only as a shape constraint in classical regularization theory and in a weaker
sense in the selection method, while piece-wise constant functions, though representing
significant a priori knowledge on the shape of the solution, cannot be embedded in either of
the frameworks.


4. Connection with mathematical programming (MP)

Most of the problems faced in the framework of Hilbert spaces are in fact usually either 
intrinsically discrete problems or problems allowing only numerical solution. In this section
the cases of the selection method and of regularization with shape constraints, discussed
in the preceding sections, are analyzed in this respect as examples of mathematical
programming problems.

4.1. Selection method as a MP problem 

As we have seen in the previous section, if the condition of section 3.1 applies, an approximate
solution to the equation (2.1.1) can be found by means of the selection method.
In practice the problem has to be solved numerically: consider for example the Fredholm
integral equation of the first kind

$$\int_a^b K(t,s)\,x(s)\,ds = y(t), \qquad c < t < d \tag{4.1.1}$$

where x(s) belongs to a set F of decreasing uniformly bounded functions. F is compact (see
section 3.4), therefore if $y(t) \in U = AF$ the problem is well-posed in the sense of Tikhonov.
In order to find an explicit solution we can replace the integral with a sum over a grid with
n nodes. Let $x_i$ ($i = 1, \dots, n$) be the value of the unknown vector x at the node i and
$y_j$ ($j = 1, \dots, m$) the components of the data vector y. The problem is to find a bounded
vector minimizing the functional

$$\Psi[x, y] = \Big\| \sum_{j=1}^{m} \sum_{i=1}^{n} \left( K_{ji}\, x_i - y_j \right) \Big\|$$

under the constraint that the components of x are decreasing. It is easy to show that this
constraint can be expressed as a sign constraint on the values of the derivative of
the function at each node. In the discrete case this reduces to the fact that suitable linear
combinations of the neighbor nodes have to be greater than zero. For example in the nearest
neighbor approximation we have

$$-\frac{x_{i+1} - x_{i-1}}{2} > 0, \qquad i = 2, \dots, n-1. \tag{4.1.2}$$

In these terms the problem is now a typical problem of quadratic programming (see
Appendix B for the main definitions and results of mathematical programming). Indeed,
in the general case the only problem concerns the explicit form of the constraints. It
must be possible to write them as follows (see Appendix B):

$$g_i(x) \le 0, \qquad i = 1, \dots, c \tag{4.1.3}$$

where the $g_i$ are scalar functions (they need to be linear or at most quadratic to define a
quadratic programming problem). Notice that (4.1.2) can be immediately rewritten like
(4.1.3). Rutman and Cabral (1974) have shown that, performing a suitable transformation,
the constraints of monotonicity, convexity, unimodality and selective non-negativity can all
be written in the form (4.1.2). In this case, as shown before, only the monotonicity and the
convexity constraints can be properly used. As we will see in the next section, however, all
of them are shape constraints that can be useful in regularization.
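To show how a shape constraint becomes a list of scalar constraint functions, the sketch below reads (4.1.2) as a central-difference condition, g_i(x) = (x_{i+1} - x_{i-1}) / 2 < 0 at the interior nodes. This reading is an interpretation of the formula, and the function names and test vectors are hypothetical.

```python
def monotonicity_constraints(x):
    """Values g_i(x) = (x[i+1] - x[i-1]) / 2 at the interior nodes.

    A vector satisfies the nearest-neighbour condition when every g_i(x) < 0,
    which is the form g_i(x) <= 0 of (4.1.3) up to strictness.
    """
    return [(x[i + 1] - x[i - 1]) / 2.0 for i in range(1, len(x) - 1)]

def is_feasible(x):
    return all(g < 0 for g in monotonicity_constraints(x))

decreasing = [1.0, 0.8, 0.5, 0.1]
bumpy = [1.0, 0.2, 1.1, 0.0]

g_decreasing = monotonicity_constraints(decreasing)
g_bumpy = monotonicity_constraints(bumpy)
```

Each g_i is linear in x, so these constraints fit a quadratic programming formulation. One caveat of the central-difference form is that it compares only nodes two steps apart; one-sided differences x_{i+1} - x_i would enforce monotonicity between adjacent nodes as well.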

4.2. Regularization with shape constraint as an MP problem 

Let us illustrate this section by means of the same example as in the previous one. Again the
problem is to solve the Fredholm integral equation of the first kind (4.1.1). This time, since
either the set F is not compact or y does not belong to $U = AF$, standard regularization
techniques of the kind described in section 3.3 are needed. Suppose moreover that some
further information is available and that it corresponds to constraints on the solution
defining closed subsets of the domain of the operator. If these constraints can be written in
the form (4.1.2), the problem of minimizing the discrete functional corresponding to (3.3.1)
subject to such constraints is again a typical stable problem of quadratic programming.

Remark: A generic mathematical programming problem, even if quadratic or linear, is not 
necessarily stable. As a matter of fact the well-posedness relies on the strong assumption 
that the functional to minimize is a stabilizing functional. If this is not the case, the problem 
has to be regularized following standard techniques (Tikhonov and Arsenin, 1977). 

Remark: While any regularized problem of the type described in section 3.3 gives rise
to a well-posed mathematical problem, the application of Kuhn-Tucker theory and of the
gradient method is subject essentially to the fulfillment of some convexity properties of
the functions involved (see Appendix B) and therefore it is guaranteed only in the case
of linear operators and a quadratic stabilizing functional.
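A minimal sketch of regularization with a shape constraint, for the case of a linear operator and a quadratic stabilizer, is given below. It is not the authors' algorithm: it assumes a squared residual, the stabilizer alpha * ||x||^2, the positivity constraint x >= 0 (a closed set), and plain projected gradient descent as the solver. The matrix K, the data y, the step size and the iteration count are hypothetical.

```python
def regularized_nonnegative(K, y, alpha, steps=5000, lr=0.2):
    """Projected gradient descent for  min ||K x - y||^2 + alpha ||x||^2  subject to x >= 0."""
    m, n = len(K), len(K[0])
    x = [0.0] * n
    for _ in range(steps):
        # Residual r = K x - y.
        r = [sum(K[j][i] * x[i] for i in range(n)) - y[j] for j in range(m)]
        # Gradient of the quadratic functional: 2 K^T r + 2 alpha x.
        grad = [2.0 * sum(K[j][i] * r[j] for j in range(m)) + 2.0 * alpha * x[i]
                for i in range(n)]
        # Gradient step followed by projection onto the closed set {x : x_i >= 0}.
        x = [max(0.0, xi - lr * gi) for xi, gi in zip(x, grad)]
    return x

K = [[1.0, 0.9], [0.9, 1.0]]     # hypothetical, nearly singular discretised kernel
y = [1.0, 0.8]                   # hypothetical data
x = regularized_nonnegative(K, y, alpha=1e-3)
```

For this data the unconstrained solution of K x = y has a negative second component; with the positivity constraint that component is held at zero and the first one settles at the minimizer of the remaining one-dimensional quadratic. The step size was chosen small enough for this particular K and would need adjusting for other matrices.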


5. Conclusion 

In this note we analysed the role played by shape constraints in ill-posed problems. The 
key concept has been that of compact set. If the shape constraints lead to the definition of 
a compact set, regularization is straightforward. Indeed the shape constraint itself provides 
sufficient conditions for the continuity of the dependence of the solution on the data. If 
the shape constraints define at least a closed set, then they can be a useful addition to
standard regularization approaches. While a suitable functional provides stability with respect to the
data, shape constraints allow one to recover a solution closer to the correct one, by taking into
account significant additional a priori knowledge on the shape of the solution.

In both cases constraints that do not define at least a closed set cannot be embedded
in the regularizing step. In particular this implies that the a priori knowledge concerning
piece-wise constant or piece-wise continuous functions, though in principle significant for
many early vision problems (the reconstruction of the 3D structure of a scene and the
recovery of the albedo, for example), cannot be used within any classical regularizing scheme.
This is an additional argument that motivates the use of Markov Random Field models for
exploiting a priori information about discontinuities and their properties (see Marroquin
et al. 1985). A different regularizing approach that can exploit constraints of this type,
considering discrete and quantized formulations, will be discussed in a forthcoming paper
(Poggio and Verri).

Finally, the discrete problem that has to be faced when solving an ill-posed problem has
been analysed as a mathematical programming problem: in the interesting case of linear
operators it becomes a standard stable problem of quadratic programming. In particular,
all the results of convex programming regarding local and global convergence of the gradient
method are guaranteed to apply.


Appendix A: Compact sets in topological and metric spaces

As we have seen before, the concept of compact set is fundamental in the regularization
of ill-posed problems. Unfortunately, there exist different definitions of compact set.
Disregarding historical problems, here is a summary of the main definitions and properties
concerning compact sets that we adopted in this note.

Let A be a topological space. An open covering of $S \subset A$ is a family $\Gamma$ of open sets in
A such that $S \subset \bigcup_{\gamma \in \Gamma} \gamma$.

$S \subset A$ is compact if, for every open covering $\Gamma$ of S, there exists a finite subfamily of
$\Gamma$ that also covers S.

Remark: A closed set is not necessarily compact (consider the real line). A compact set is 
not necessarily closed. Compact sets are always closed in Hausdorff spaces (a topological 
space is a Hausdorff space if for each pair of distinct points $x_1$ and $x_2$, there exist two
disjoint neighborhoods containing them). 

In topological spaces the following Lemma holds: 

Lemma A.1 If $T \subset A$ is compact, then for every infinite $S \subset T$, $S' \cap T \neq \emptyset$ ($S'$ is the set
of accumulation points of S).

Notice that the converse of Lemma A.1 is not true in general. Now let A be a metric
space (and hence a topological Hausdorff space); then we have:

Lemma A.2 If $T \subset A$ and for every infinite $S \subset T$, $S' \cap T \neq \emptyset$, then T is compact.

Remark: Combining Lemmas A.1 and A.2 the usual definition of compactness in metric
spaces can be obtained: a set $S \subset A$, A a metric space, is compact if for every sequence of
points in S there is a subsequence converging to a point of S.

Furthermore, in metric spaces the concept of boundedness can be defined, so that the 
following Lemma can be proved: 

Lemma A.3 If $S \subset A$ is compact then S is closed and bounded.

The converse of Lemma A.3 is not true in general (see section 3.4 for a counterexample).

Remark: In $\mathbb{R}^n$ the converse of Lemma A.3 holds. Indeed in $\mathbb{R}^n$, by the Borel theorem, any
bounded infinite set has an accumulation point: so, if it is closed, it is also compact.

It follows that discretization makes a problem well-posed (but possibly ill-conditioned).

Appendix B: Mathematical programming: Definitions and main results 

In vision, when a regularized problem has to be solved, numerical methods, based on
discretizing the original continuous formulation, are usually needed. These numerical methods
always lead to classical problems of mathematical programming. As may be expected, in
these cases Tikhonov regularization theory and mathematical programming theorems predict
the same results in terms of existence and uniqueness of the solution (see sections 4.1
and 4.2). Here we review, for the sake of completeness, the main definitions and results of
mathematical programming theory (for more details see Arrow et al., 1958, for example).

Let us consider the problem of finding a minimum for a given functional $\varphi = \varphi(z)$ on
a set $G = \{z : g_i(z) \le 0,\ i = 1, \dots, m\}$ where $z = (z_1, \dots, z_n) \in L \subset \mathbb{R}^n$ and the $g_i$ are scalar
functions. If the functions $\varphi$ and $g_i$ ($i = 1, \dots, m$) are linear, the problem is called a linear
programming problem, otherwise non-linear. In both cases it is a mathematical programming
problem.

Typically the problem of finding conditional extrema of a given functional is solved 
by means of the Lagrange multipliers theory. Classical theorems on Lagrange multipliers 
provide only necessary conditions for the existence of such multipliers: Kuhn-Tucker theory, 
in turn, fills the gap, providing sufficient conditions for their existence (obviously closely 
related to the existence of extrema of functionals). This theory, therefore, is useful in 
most of the mathematical programming problems. Let us review briefly the main results of 
Kuhn-Tucker theory. 


Kuhn-Tucker theory

Let us call the conditional problem stated above P.1 and associate with it the following
Lagrangian form

$$\Phi(z, w) = \varphi(z) + \sum_{i=1}^{m} w_i\, g_i(z)$$

where $w = (w_1, \dots, w_m)$ with $w_i \in \mathbb{R}$, $i = 1, \dots, m$. It is easy to see that if the pair $(z', w')$
is a saddle point for the above Lagrangian form, $z'$ is a solution of P.1. Let us call P.2 the
problem of finding a saddle point for the Lagrangian form $\Phi$; thus the following Lemma
holds:

Lemma B.1 Given P.1 and P.2, if the pair $(z', w')$ is a solution to P.2 then $z'$ is a solution
to P.1.

To prove the converse of Lemma B.1, i.e. to show the equivalence between P.1 and P.2,
some constraints on the functions $\varphi$ and $g_i$, $i = 1, \dots, m$, are needed; more precisely:

Theorem B.2 (Kuhn-Tucker) Let $\varphi(z)$ and $g_i(z)$, $i = 1, \dots, m$, be convex on
$Z = \{z : z_i \ge 0,\ i = 1, \dots, n\}$. If there exists $z^0 \in Z$ such that $g_i(z^0) < 0$, $i = 1, \dots, m$, then $z'$ is a solution
to P.1 if and only if $\exists\, w'$ such that the pair $(z', w')$ is a solution to P.2.

In the case of $C^1$ functions the celebrated Kuhn-Tucker conditions can be introduced.
They provide necessary conditions for the existence of a solution to a saddle point problem.
Under convexity assumptions the Kuhn-Tucker conditions become sufficient, hence
guaranteeing the existence of a solution to the associated mathematical programming problem. (If $\varphi$
is strictly convex it also turns out that the solution is unique.) In obvious notation they
are:

$$
\begin{aligned}
&\sum_{j=1}^{m} w'_j\, \frac{\partial \Phi}{\partial w_j}(z', w') = 0 \\
&\frac{\partial \Phi}{\partial w_j}(z', w') \le 0, \quad j = 1, \dots, m \\
&w'_j \ge 0, \quad j = 1, \dots, m \\
&\sum_{i=1}^{n} z'_i\, \frac{\partial \Phi}{\partial z_i}(z', w') = 0 \\
&\frac{\partial \Phi}{\partial z_i}(z', w') \ge 0, \quad i = 1, \dots, n \\
&z'_i \ge 0, \quad i = 1, \dots, n.
\end{aligned}
$$
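The conditions can be checked by hand on a one-dimensional toy problem, which the sketch below does in code. The problem is a hypothetical example, not taken from the text: minimize (z - 2)^2 subject to z - 1 <= 0 and z >= 0, with Lagrangian form (z - 2)^2 + w (z - 1). The candidate saddle point (1, 2) was found by hand; the code verifies the Kuhn-Tucker conditions there and tests the saddle-point inequalities on a coarse grid.

```python
# Toy problem (hypothetical): minimise phi(z) = (z - 2)^2 subject to g(z) = z - 1 <= 0, z >= 0.
def lagrangian(z, w):
    return (z - 2.0) ** 2 + w * (z - 1.0)

def d_dz(z, w):
    return 2.0 * (z - 2.0) + w

def d_dw(z, w):
    return z - 1.0

def kuhn_tucker_holds(z, w, tol=1e-12):
    """Check the six Kuhn-Tucker conditions for one variable z and one multiplier w."""
    return (d_dw(z, w) <= tol and abs(w * d_dw(z, w)) <= tol and w >= 0.0
            and d_dz(z, w) >= -tol and abs(z * d_dz(z, w)) <= tol and z >= 0.0)

z_opt, w_opt = 1.0, 2.0        # candidate saddle point, found by hand

grid = [k / 10.0 for k in range(0, 51)]    # 0.0, 0.1, ..., 5.0
# Saddle point: maximum over w at z_opt, minimum over z at w_opt.
is_saddle = all(
    lagrangian(z_opt, w) <= lagrangian(z_opt, w_opt) + 1e-12
    and lagrangian(z_opt, w_opt) <= lagrangian(z, w_opt) + 1e-12
    for z in grid for w in grid
)
```

The unconstrained minimizer z = 2 with w = 0 fails the conditions because it violates the constraint z - 1 <= 0, whereas the pair (1, 2) satisfies all of them.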


Gradient method 

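Let us now review briefly the gradient method, which is one of the most useful methods
for finding saddle points of a given function. It consists essentially in finding the solution of
the following system S of differential equations

$$\frac{dz_i}{dt} = \begin{cases} 0 & \text{if } \dfrac{\partial \Phi}{\partial z_i} > 0 \text{ and } z_i = 0 \\[2mm] -\dfrac{\partial \Phi}{\partial z_i} & \text{otherwise} \end{cases} \qquad i = 1, \dots, n$$

$$\frac{dw_j}{dt} = \begin{cases} 0 & \text{if } \dfrac{\partial \Phi}{\partial w_j} < 0 \text{ and } w_j = 0 \\[2mm] \dfrac{\partial \Phi}{\partial w_j} & \text{otherwise} \end{cases} \qquad j = 1, \dots, m$$

where t is a parameter. Now if the pair $(z', w')$ is a saddle point for $\Phi(z, w)$ it follows that:

$$\frac{\partial \Phi}{\partial z_i}(z', w') \ge 0, \quad i = 1, \dots, n$$

$$\frac{\partial \Phi}{\partial w_j}(z', w') \le 0, \quad j = 1, \dots, m.$$

In particular if $\frac{\partial \Phi}{\partial z_i}(z', w') > 0$ then $z'_i = 0$ and if $\frac{\partial \Phi}{\partial w_j}(z', w') < 0$ then $w'_j = 0$. Without
loss of generality (just for notational convenience) suppose in the sequel that for $i = 1, \dots, p$
($p \le n$) $\frac{\partial \Phi}{\partial z_i}(z', w') = 0$, while for $i = p+1, \dots, n$ $\frac{\partial \Phi}{\partial z_i}(z', w') > 0$, and that for $j = 1, \dots, q$
($q \le m$) $\frac{\partial \Phi}{\partial w_j}(z', w') = 0$, while for $j = q+1, \dots, m$ $\frac{\partial \Phi}{\partial w_j}(z', w') < 0$.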

The following theorem, now, guarantees local convergence of the gradient method. 

Theorem B.3 Let $\Phi(z, w)$ have a saddle point $(z', w')$ under the constraint $z \in Z$ where
$Z = \{z : z_i \ge 0,\ i = 1, \dots, n\}$ and $w \in W$ where $W = \{w : w_j \ge 0,\ j = 1, \dots, m\}$, and
let $\Phi$ be analytic in some neighborhood of $(z', w')$. Suppose further that the matrix of the
second derivatives of $\Phi$ in the first p components of z defines a positive definite form and
that $z'_i > 0$, $i = 1, \dots, p$ and $w'_j > 0$, $j = 1, \dots, q$. Then for any pair $(z'', w'')$ in a sufficiently
small neighborhood of $(z', w')$:

i) there is a unique solution $z = z(t, z'', w'')$ and $w = w(t, z'', w'')$ to the system S such
that:

ii) $\lim_{t \to \infty} z(t, z'', w'') = z'$ and

iii) in any limit point $w^0$ of the function $w = w(t, z'', w'')$ as $t \to \infty$, the pair $(z', w^0)$ is a
saddle point of $\Phi(z, w)$.

Remark: The classical theorems of existence and uniqueness of the solution for systems of
differential equations cannot be used, since no assumption is actually made on the continuity
of the derivatives of the variables.
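The gradient method can be tried out on a small example by integrating the system S numerically. The sketch below is an illustration under stated assumptions: the system is discretized with forward-Euler steps (the text defines it in continuous time), the iterates are clamped at zero so that they stay in Z and W, and the Lagrangian form (z - 2)^2 + w (z - 1), the step size and the iteration count are hypothetical choices.

```python
def gradient_method(z=0.0, w=0.0, step=0.01, iterations=5000):
    """Forward-Euler integration of the system S for the toy Lagrangian
    Phi(z, w) = (z - 2)^2 + w * (z - 1)   (hypothetical example, one z and one w)."""
    for _ in range(iterations):
        dphi_dz = 2.0 * (z - 2.0) + w
        dphi_dw = z - 1.0
        # dz/dt = 0 if dPhi/dz > 0 and z = 0, otherwise -dPhi/dz
        dz = 0.0 if (dphi_dz > 0.0 and z == 0.0) else -dphi_dz
        # dw/dt = 0 if dPhi/dw < 0 and w = 0, otherwise +dPhi/dw
        dw = 0.0 if (dphi_dw < 0.0 and w == 0.0) else dphi_dw
        # Clamp at zero so that the discrete iterates stay in Z and W.
        z = max(0.0, z + step * dz)
        w = max(0.0, w + step * dw)
    return z, w

z_final, w_final = gradient_method()
```

Starting from the origin, z first moves toward the unconstrained minimizer while w stays at zero; once the constraint z - 1 <= 0 is violated the multiplier w grows and pulls z back, and the pair approaches the saddle point (1, 2) of this toy Lagrangian.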

Before stating the theorem on global stability of the gradient method the following 
definition is needed: 

$z_i(t)$, $i = 1, \dots, n$, and $w_j(t)$, $j = 1, \dots, m$, solutions of the system S, are a regular solution if,
when $z_i(t_\nu) = 0$, $i = 1, \dots, n$, and $w_j(t_\nu) = 0$, $j = 1, \dots, m$, with $\nu \in \mathbb{N}$, for some sequence
$\{t_\nu\}$ such that $t_\nu > 0$ and $\lim_{\nu \to \infty} t_\nu = 0$, there is some $\bar{t} > 0$ such that $z_i(t) = 0$, $i = 1, \dots, n$,
and $w_j(t) = 0$, $j = 1, \dots, m$, for $0 \le t \le \bar{t}$.

Theorem B.4 Let $\Phi(z, w)$ be a strictly convex, continuous and twice differentiable function
in $z \in Z$ and $w \in W$. Let the system S have a regular solution with respect to any
pair $(z'', w'')$ where $z'' \in Z$ and $w'' \in W$. Then there is a unique regular solution of the
system S with any initial position. Furthermore if $\Phi$ has a saddle point in $(z', w')$ under the
constraints $z \in Z$ and $w \in W$, $z'$ is uniquely determined and any solution of S converges
to $z'$.

Remark: Actually, by introducing suitable strictly increasing functions $\rho_j$, $j = 1, \dots, m$, of one
variable such that $\rho_j(0) = 0$, $j = 1, \dots, m$, the condition of strict convexity in Theorem B.4
can be relaxed to convexity if one applies the gradient method to the modified Lagrangian
form:

$$\Phi_\rho(z, w) = \varphi(z) + \sum_{j=1}^{m} w_j\, \rho_j[g_j(z)].$$

In conclusion Theorem B.4 guarantees global convergence of the gradient method for 
convex programming (including therefore the important case of quadratic programming); 
the modified Lagrangian form above allows the successful extension of the gradient method 
to the broad class of linear programming problems.